
Do Sitemaps Affect Crawlers?

Like anyone else, I fall into habits, good and bad.  Recently, while working on a client’s website, I created a Sitemap and submitted it to the search engines, like I always do.  I started to wonder whether this really helps the site and what effect submitting a Sitemap actually has.

I approached one of my clients who has a semi-popular blog running on WordPress with the Google XML Sitemaps Generator plugin.  I asked for permission to install my tracking script on their site to track the whereabouts of the bots.  For those of you who don’t know the Google XML Sitemaps Generator: every time you create or edit a post in WordPress, the plugin builds a new Sitemap and submits it to the major search engines.
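
To make "builds a new Sitemap" concrete, here is a minimal sketch of what such a file contains. It is not the plugin's code (the plugin is written in PHP). The function name `build_sitemap` and the example URL and date are assumptions for illustration only; the `urlset`, `url`, `loc`, and `lastmod` elements and the namespace come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(posts):
    """Return sitemap XML for an iterable of (url, last_modified) pairs.

    Hypothetical helper for illustration; not the plugin's implementation.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, last_modified in posts:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        # lastmod uses the W3C date format; a plain YYYY-MM-DD date is valid.
        ET.SubElement(entry, "lastmod").text = last_modified.strftime("%Y-%m-%d")
    return ET.tostring(urlset, encoding="unicode")


# Made-up example post; a real generator would read these from the blog.
posts = [
    ("https://example.com/new-post/", datetime(2024, 1, 15, tzinfo=timezone.utc)),
]
sitemap_xml = build_sitemap(posts)
print(sitemap_xml)
```

Regenerating this file on every publish is what lets a search engine learn about a new URL without having to discover it by following links.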

My client is good at posting new content to their blog, usually 2 or 3 posts a week.  The script that I installed on their website was written in PHP and recorded every time a bot accessed the Sitemap, every time the Sitemap was submitted, and every page a bot crawled on the website.  The script stored this information in a MySQL database along with a timestamp, the IP address, and the user agent.  I also modified the Sitemap generator to record a timestamp every time the Sitemap was submitted to the search engines.
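
For readers who want to try something similar, the logging idea can be sketched roughly as follows. This is not my actual script: that one was PHP with MySQL, while this sketch uses Python and SQLite as a stand-in so it runs anywhere. The table name `bot_hits`, the function names, and the sample IP addresses are all assumptions made up for the example.

```python
import sqlite3
from datetime import datetime, timezone

# Lowercase substrings found in the Googlebot and Yahoo! Slurp user agents.
BOT_MARKERS = ("googlebot", "slurp")


def identify_bot(user_agent):
    """Return the matching bot marker, or None for ordinary visitors."""
    ua = user_agent.lower()
    for marker in BOT_MARKERS:
        if marker in ua:
            return marker
    return None


def open_log(path=":memory:"):
    """Open the log database (in-memory by default) and create the table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS bot_hits ("
        "hit_time TEXT, ip TEXT, user_agent TEXT, bot TEXT, url TEXT)"
    )
    return conn


def log_hit(conn, ip, user_agent, url, now=None):
    """Store one request if it came from a tracked bot; return True if stored."""
    bot = identify_bot(user_agent)
    if bot is None:
        return False
    now = now or datetime.now(timezone.utc)
    conn.execute(
        "INSERT INTO bot_hits VALUES (?, ?, ?, ?, ?)",
        (now.isoformat(), ip, user_agent, bot, url),
    )
    conn.commit()
    return True


conn = open_log()
# A bot request for the Sitemap is recorded...
log_hit(conn, "66.249.66.1",
        "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
        "/sitemap.xml")
# ...while a normal browser request is ignored.
log_hit(conn, "203.0.113.5", "Mozilla/5.0 (Windows NT 10.0) Firefox/120.0", "/new-post/")
rows = conn.execute("SELECT bot, url FROM bot_hits").fetchall()
print(rows)
```

Matching on the user agent alone is a simplification, since anyone can send a fake user agent; storing the IP address alongside it leaves room to verify the hits afterwards.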

On to the data!

The experiment was to see if submitting a Sitemap to Google and Yahoo would decrease the time it took each search engine to crawl and index the page.  The results for this blog were amazing!  When a Sitemap was submitted, the average time it took for the bot to visit the new post was 14 minutes for Google and 245 minutes for Yahoo.  When no Sitemap was submitted and the bot had to crawl its way to the post, it took 1375 minutes for Google and 1773 minutes for Yahoo.  The averages were calculated over 12 different posts: 6 with a Sitemap submitted and 6 without.
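
To put those averages in perspective, a few lines of arithmetic turn the minutes into hours and a speed-up factor. The numbers are the four averages reported above; the dictionary layout and the `speedup` helper are just an illustrative way to organize them.

```python
# Average minutes until the bot first visited a new post (figures from this post).
avg_minutes = {
    "Google": {"sitemap": 14, "no_sitemap": 1375},
    "Yahoo": {"sitemap": 245, "no_sitemap": 1773},
}


def speedup(engine):
    """How many times faster the first visit was when a Sitemap was submitted."""
    times = avg_minutes[engine]
    return times["no_sitemap"] / times["sitemap"]


for engine, times in avg_minutes.items():
    print(
        f"{engine}: {times['no_sitemap'] / 60:.1f} h without a Sitemap, "
        f"{times['sitemap'] / 60:.1f} h with one, "
        f"about {speedup(engine):.0f}x faster"
    )
```

In other words, Google's first visit came roughly 98 times sooner with a Sitemap and Yahoo's roughly 7 times sooner, keeping in mind that each average rests on only 6 posts.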

[Charts: Crawl Time - No Sitemap; Crawl Time - Sitemap Submitted]

After calculating the data, I thought there had to be a mistake.  I went to a few of my own sites (GR Web Designs and Grand Haven Football), quickly created new posts, and submitted a Sitemap to Google and Yahoo.  I checked my tracking script 30 minutes later: Google had already been there and the new posts were indexed.  Yahoo followed shortly after Google.

After seeing how long it took the bot to crawl without a Sitemap, I figured there was a problem with the structure of the website and the bots couldn’t crawl to the new pages.  When I looked at the site and had others check its crawlability, we found no problems.  I also found that the bot accessed the page containing the links to the new posts but did not go on to crawl the new posts themselves until later.

I was doing research for this post and found Rand’s post titled “My Advice on Google Sitemaps – Verify, but Don’t Submit,” and I found myself perplexed.  Why would Rand tell me not to submit my Sitemap when I received such great results from it?  After rereading the post, I found that he was more interested in getting the valuable crawl data.  Granted that I’m using WordPress and know that all my pages are crawlable, why wouldn’t I submit the Sitemap, especially if I’m going to get results like above?

For sites like the one in the experiment, whose owners know there are no issues with the natural crawl, I would suggest submitting a Sitemap because it will lead to a faster crawl and faster inclusion in the indexes.  If you are unsure whether your site's link structure is correct, I would suggest that you do NOT submit a Sitemap.  Leaving it out will help you determine whether or not you have crawl problems.  For all those people out there whose websites have great link structure, why not help get things going faster and submit a Sitemap to Google and Yahoo today?

I would love to hear what the SEOmoz community has to say about their use of Sitemaps.  Remember, this experiment was only completed on one site and I might do further study on the use of Sitemaps if I get a good response from all of you.
